Topic
  • 19 replies
  • Latest Post - ‏2013-02-07T21:18:54Z by SystemAdmin
SystemAdmin
SystemAdmin
435 Posts

groups for new users?

‏2013-02-01T01:43:36Z |
I'm a new user of SPSS Modeler 15. I've only taken self-guided user courses and don't know anyone else who is licensed. I have a file for a client that I'd like to get started on; it seems rather straightforward, but I'm struggling to get going. Are there any "new user" groups, or groups organized by city, that could act as support?
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-01T10:14:40Z  
    Hi - Can you describe the file problem you are having that's stopping you getting going? No reason not to use this forum to support you.

    Regards
    Sarah
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-05T20:19:29Z  
    Thank you. I've loaded an Excel file and am hoping to find out whether credit score is predictive of risk. Modeler defaulted some measurement types to values I wouldn't have expected. For example, I have columns with $amounts past due, and these defaulted to "Categorical." I would've thought they should be "Continuous." Maybe my assumption is wrong? Anyway, I've attempted to change the type, and it won't let me; it just flips right back to Categorical. Thoughts?
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-05T20:38:50Z  
    The $ character is causing Modeler to call this a string field and classify it as categorical. You can strip that character from the field -- either in Excel beforehand or within Modeler -- to create an integer field that will be continuous.
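    For anyone who prefers to clean the values before they reach Modeler, here is a minimal Python sketch of the same idea: strip the currency symbol so the value can be stored as a number. It assumes the cells really do contain a "$" character and hold whole-dollar amounts with optional comma grouping. The function name and sample values are hypothetical, and this is not Modeler's own logic.

```python
def parse_dollar_amount(text):
    """Convert a string such as "$1,250" to an int; return None for blanks."""
    # Assumption: "$" is the only currency symbol and "," is a grouping symbol.
    cleaned = text.strip().replace("$", "").replace(",", "")
    if cleaned == "":
        return None
    return int(cleaned)


# Hypothetical sample values, including a whitespace-only cell.
amounts = ["$120", "$1,250", "  ", "$0"]
parsed = [parse_dollar_amount(a) for a in amounts]
print(parsed)
```

    Once the values are numeric rather than strings, a continuous measurement level becomes possible.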
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-05T21:25:11Z  
    Hey Kath,
    As Mark already mentioned, the formatting in Excel is probably not correct (although, in my experience, this has nothing to do with the "$" in the name).

    To see the problem, look at another column of your Type node. To the left of each variable name there is a symbol indicating the storage. In your node this is probably an "A", indicating that the variable is a string. A string variable can only be categorical, so you need to change the storage first and then change the measurement in a second step.

    Now, how do you change the storage? Modeler simply takes over the format from Excel, so the fact that it sees this field as a string probably means that the column is formatted as text in Excel. To change this, select the entire column in Excel, then right-click -> Format Cells -> Number. That should do the trick in Modeler.

    If, however, you do not want to fix this in Excel, you can also fix it in Modeler. Just after your input node, place a Filler node to "overwrite" the problematic variables. As the condition, choose "Always", and as the function, enter "to_integer(@FIELD)" or "to_real(@FIELD)", depending on whether you want integers or reals. In a Type node placed after the Filler, you will see that the "A" has changed into a "#" (for reals) or a diamond (for integers).

    A more detailed description can be found in the linked user guide.
    Hope this helps.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-05T21:29:32Z  
    Sorry, wrong link. If you click through to "Measurement Levels", you are where you want to be...
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-05T21:47:17Z  
    (although, to my experience this has nothing to do with the "$" in the name)

    True, this is not an issue with Excel files. CSV files, OTOH.... I find that Modeler 14.1 slows to a crawl when reading large Excel files over the network (probably because it's pulling in all that additional data about field storage), so I typically use CSV files as sources and do the data typing in Modeler.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-06T04:33:26Z  
    Thank you for the advice! I attempted to make the changes in Excel, but unfortunately, even changing the column formats to "Number" had no effect on how Modeler v.15 reads them. The dollar amounts still came across as strings. Then I attempted to make the changes in Modeler. I connected a Filler node to my Excel source node and replaced the values using "to_integer." I connected a Type node and changed the measurements to Continuous. Then I connected a Distribution node so I could look at things, and got the following error:
    "Failed to type application of operator: and(LONG,LONG)" Any idea what this means, or where I can go to find out what it means?
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-06T04:57:59Z  
    Did you refresh your Excel source node after making changes to the source file? I can't recall whether Modeler refreshes that source when a downstream node is executed, but you may as well go into the source node and hit the refresh button to make certain you're reading the latest version of the file. I assume this process works in 15 as it does in 14.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-06T11:28:33Z  
    Hey Kath,

    You are on the right path. The fact that it gives this error means the field is indeed an integer (good news!!).
    What "and(LONG,LONG)" means is that somewhere Modeler has to evaluate a statement like "<integer> and <integer>" (for example "522 and 141"), which, correctly, does not make sense to it. This could, for example, be in the condition of your Filler node.
    To move forward, could you answer the following questions:

    • Can you confirm your Filler node looks like the attached screenshot (except for the variable name, which may be different)?
    • Can you confirm you are on FP1 of V15? If not, can you install it?
    • Could you confirm you pressed "Read Values" in the Type node, after which the format of the field $amount changes to Integer?
    • You have 4 nodes. Could you "preview" all of them except the Distribution node (by double-clicking the node and clicking Preview)? Could you tell me at which node you receive the first error?
    • Which variable did you place in the Distribution node? Note that $amount is no longer a valid option here, as it is now numeric and distribution graphs are made for categorical fields. Try using a histogram instead if you want to look at the $amount variable.
    • If you want, you can share your stream (and a sample anonymized dataset of a few lines) with me and I can have a better look.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T02:56:38Z  
    Thanks for sticking with me! My responses to each question are in caps:
    • Can you confirm your filler node looks like the screenshot attached (except for the variable name which is maybe different)? YES, EXCEPT THAT I'M DOING THIS FOR MULTIPLE FIELDS... DO I NEED TO DO ONE AT A TIME?...I'VE ATTACHED A DOC THAT SHOWS MY FILLER NODE.
    • Can you confirm you are on FP1 of V15? If not, can you install FP. I DO NOT KNOW WHAT YOU MEAN BY FP1...HOW DO I DETERMINE IF I'M ON FP1?
    • Could you confirm you pressed on "Read Values" in the Type node, after which the format of the field $amount changes to Integer? I HAD NOT PRESSED "READ VALUES" BUT TRIED IT AND GOT THE SAME ERROR MESSAGE.
    • You have 4 nodes, could you "preview" all of them except the Distribution node (by double clicking on the node and clicking Preview). Could you tell me as from which node you receive the first error? I FIRST RECEIVE THE ERROR IN THE FILLER NODE.
    • Which variable did you place in the Distribution. Note that now, $amount is no longer a valid option to put here, as it is a numerical and distribution graphs are made for categorical. Try using a histogram instead if you want to use the $amount variable. I REALIZED IT WASN'T AVAILABLE, AND CHOSE TO LOOK AT IT BY CREDIT SCORE.
    • If you want, you can share your stream (and a sample anonymized dataset of a few lines) with me and I can have a better look. IF WHAT I'VE PROVIDED SO FAR DOESN'T GET US FURTHER, AND YOU THINK THAT WOULD HELP, I CAN DO THAT.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T03:09:37Z  
    Instead of refreshing, I had just started over with importing the file.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T09:08:52Z  
    Thanks for the answers.
    OK, that gives more insight. So the problem is somewhere in the Filler node...
    Unfortunately, I cannot find your attachment, but if you say it looks like the screenshot I sent, I believe you :)
    I think it is easier to share the stream and some data so I can see exactly what we are working with, as I am really a bit puzzled by this. If you could do that, I can try your stream to see exactly what goes wrong. If you do not want to post it publicly, you can send me a private message.

    About FP1: that is Fix Pack 1 of Modeler 15, a pack of corrections for known bugs. You can find out whether it is installed by opening Modeler and going to Help -> About -> Additional Details. You then get a window with your licenses, and above them is the version number of the software. If it is 15.0.0.1, FP1 is installed. If it is 15.0.0.0, FP1 is not installed, and I would suggest installing it.

    You can find the FP at http://www-01.ibm.com/support/docview.wss?uid=swg24033607.

    If it still does not work once the installation is complete, you can send me the stream and the Excel file.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T16:50:02Z  
    Embarrassed to ask, but exactly how do I send you the stream and file?
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T18:26:48Z  
    You can just attach it below the message window. Just browse and include.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T19:42:58Z  
    Did it all come through okay?
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T20:07:48Z  
    The stream came through ok. As there is a maximum of one file, I did not receive any data, but that is fine.

    Anyway, it was enough to see the problem:
    In the "replace with" you have a formula like
    
    to_integer(field1) and to_integer(field2) and ... and to_integer(fieldX)
    

    That is where things go wrong. This "and" is the logical operator, so it can only handle true or false values, not integers. You are now asking Modeler to calculate something like "10 and 1572 and ... and 142", which does not make sense (and causes the error).

    Now, what should it be? Use @FIELD in combination with (only ONE) to_integer function, and everything will be fine. See the screenshot for the full Filler node options.

    What happens exactly: in "Fill in fields" you have specified 7 different variables. You want to replace each of them with its integer form, and that is exactly what @FIELD does. Modeler evaluates the formula 7 times, each time replacing @FIELD with the field under consideration. If this explanation is not clear, you can search the manual to see exactly what @FIELD does.

    Too bad I did not get the Excel file, but this way it is solved within Modeler anyway.
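
    To make the @FIELD idea concrete, here is a rough Python analogy (not CLEM, and not how Modeler is implemented): one expression is applied to each selected field in turn, rather than several conversions being chained together with "and". The names fill_fields and to_integer, the field names, and the sample record are all hypothetical.

```python
def to_integer(value):
    """Stand-in for a string-to-integer conversion; blanks become None."""
    value = value.strip()
    return int(value) if value else None


def fill_fields(record, fields, expression):
    """Apply one expression to each selected field, one field at a time."""
    filled = dict(record)
    for name in fields:
        # The expression only ever sees the current field's value,
        # which is the role @FIELD plays in the Filler node.
        filled[name] = expression(record[name])
    return filled


# Hypothetical record with two string fields to convert and one to leave alone.
record = {"past_due_30": "120", "past_due_60": "45", "customer": "A17"}
result = fill_fields(record, ["past_due_30", "past_due_60"], to_integer)
print(result)
```

    Note that Python's own "and" happens to accept integers, so this sketch does not reproduce the Modeler error; it only shows the one-expression-per-field pattern.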
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T20:20:55Z  
    So, if I want to change more than one field, I should use a separate filler node for each one?
    Here is the Excel file. I thought it would've come across with the stream.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T20:46:02Z  
    No, you can use one Filler node for this, as long as all the fields use the same formula, with @FIELD pointing to the field of interest (7 at the same time), which in this case they do. Just make sure everything in your Filler node looks exactly like the screenshot. That should do the trick.
  • SystemAdmin
    SystemAdmin
    435 Posts

    Re: groups for new users?

    ‏2013-02-07T21:18:54Z  
    OK, attached you can find the two solutions: one in Excel, one in Modeler.

    • In Excel, your data is not as clean as it looks. The "empty" values are not really empty, because they are filled with white space. If you clean your Excel file so that the empty cells are actually empty, there is no problem in Modeler. A clean version of the Excel file is attached so you can see that it works easily.

    • In Modeler, the solution is as I specified before: use one Filler node to change the format from text to reals. However, there is one more thing to be aware of. Because the values are strings, Modeler also copies any grouping and decimal symbols that occur in Excel. So I told Modeler to treat the comma as the decimal symbol (in Tools -> Stream Properties -> Options), and I removed the grouping symbol in the Filler node.

    So now you have the two options. It is up to you to choose the easier one...
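
    For reference, the two clean-up steps described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the comma is the decimal symbol (as in the post), the period is assumed to be the grouping symbol (the post does not say which one was used), and the function name and sample cells are hypothetical.

```python
def clean_cell(text, decimal_symbol=",", grouping_symbol="."):
    """Return a float for a numeric string, or None for a blank cell."""
    stripped = text.strip()
    if stripped == "":
        # Whitespace-only cells are treated as truly empty.
        return None
    # Drop the grouping symbol first, then normalize the decimal symbol.
    without_grouping = stripped.replace(grouping_symbol, "")
    return float(without_grouping.replace(decimal_symbol, "."))


# Hypothetical cells: grouped value, whitespace-only, plain decimal, empty.
cells = ["1.234,56", "   ", "78,5", ""]
print([clean_cell(c) for c in cells])
```

    The order matters: removing the grouping symbol before converting the decimal symbol avoids turning "1.234,56" into a string with two decimal points.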